Abstract


Microsatellite loci were screened from the genomic data of Dysomma anguillare and their 
composition and distribution were analysed by bioinformatics for the first time. The results 
showed that 4,060,742 scaffolds with a total length of 1,562 Mb were obtained by high- 
throughput sequencing and 1,160,104 microsatellite loci were obtained by MISA screening, 
which were distributed on 770,294 scaffolds. The occurrence frequency and relative 
abundance were 28.57% and 743/Mb, respectively. Amongst the six complete 
microsatellite types, dinucleotide repeats accounted for the largest proportion (592,234, 
51.05%), the highest occurrence frequency (14.58%) and the largest relative abundance 
(379.27/Mb). A total of 1,488 microsatellite repeat motifs were detected in the genome of D.
anguillare, amongst which hexanucleotide repeat motifs were the most numerous (608),
followed by pentanucleotide repeat motifs (574), tetranucleotide repeat motifs (232),
trinucleotide repeat motifs (59), dinucleotide repeat motifs (11) and mononucleotide repeat
motifs (4). The abundance of microsatellites of the same repeat type decreased as the
copy number increased. Amongst the six types of nucleotide repeats, the predominant
repeat motifs were A (191,390, 43.77%), CA (150,240, 25.37%), AAT (13,168, 14.05%),
CACG (2,649, 8.41%), TAATG (119, 19.16%) and CCCTAA (190, 7.65%),
respectively. The data of the number, distribution and abundance of different types of 
microsatellites in the genome of D. anguillare were obtained in this study, which would lay 
a foundation for the development of high-quality microsatellite markers of D. anguillare in 
the future. 


Introduction


Shortbelly eel (Dysomma anguillare Barnard, 1923) is a small-sized warm water eel that is 
widely distributed in the Indian Ocean and the western Pacific Ocean (Nelson et al. 2016). 
In China, it is also one of the predominant bycatch species in the offshore waters of the southern
East China Sea (Zhao et al. 2016). As an intermediate to high trophic-level species in the
coastal food webs, it is of great significance in the offshore marine ecosystem and 
biodiversity. However, the limited studies of D. anguillare were mainly focused on the 
nutrition and feeding habits (Zhang and Tang 2003), the spatial-temporal pattern of 
community structure (Liu and Xian 2009) and the effects of lipid removal on the stable 
isotopes (Yang et al. 2020). 


The explicit germplasm genetic characteristics of fishery species are considered to be the 
indispensable prerequisite for effective fisheries management (Hemmer-Hansen et al. 
2018). However, the available genetic data for this species are still scarce and only partial 
mitochondrial and nuclear gene sequences have hitherto been reported and analysed 
(Chen et al. 2014, Chang et al. 2016, Wang et al. 2019). Microsatellite DNA, also named
simple sequence repeats (SSRs), consists of short tandem repeats (typically with motifs of 1-6
nucleotides and mostly less than 100 bp in length) that occur ubiquitously in eukaryotic
organisms. Moreover, the number of repeats varies markedly amongst different genotypes of the
same species (Tautz and Renz 1984). Co-dominant microsatellite markers,
based on polymerase chain reaction (PCR) techniques, offer the advantages of high
polymorphism, good repeatability, simple operation and low experimental cost. Therefore, they
have been widely applied in gene mapping and QTL analysis, population
genetics and evolutionary research, as well as molecular marker-assisted breeding
(Messier et al. 1996, Schlotterer 2000). At present, the conventional strategies for
developing microsatellite loci mainly include the anchored-PCR-based
method, the selective hybridisation enrichment method, database searching and the related-species
selection method (Sun et al. 2009). Nevertheless, these approaches
are not only time-consuming and expensive, but also capture only part of the distribution of
microsatellites and yield a limited number of molecular markers.


In recent years, with the rapid progress of high-throughput sequencing (HTS)
technology and the reduction in sequencing cost, developing numerous highly polymorphic
SSR markers from multi-omics data has become increasingly convenient. In this study,
the genome-wide sequences of Dysomma anguillare were obtained for the first time using the
HiSeq™ 4500 platform; meanwhile, the distribution and characteristics of the SSR loci
were also analysed by bioinformatics tools. The findings will help to provide useful 
references and basic information for germplasm resources conservation, population 
genetic evaluation and phylogenetic relationships analysis amongst related species of 
Anguilliformes. 


Material and Methods 
Sample collection and genomic DNA extraction 


Fifty-three samples of Dysomma anguillare were collected by trawling in the coastal waters 
of Zhoushan, Zhejiang Province in September 2022. After preliminary morphological 
identification, muscle tissues from five male and five female individuals were randomly 
selected for the genomic DNA extraction by the traditional Tris-saturated phenol method 
(Maniatis et al. 1982). Subsequently, DNA barcoding, based on the mitochondrial
COI sequence, was conducted to confirm the species identification. Agarose gel
electrophoresis (1%) and a NanoDrop 2000 ultraviolet spectrophotometer (Thermo
Fisher Scientific, USA) were used to assess the integrity and purity of the genomic DNA,
respectively. The obtained DNA samples were stored at -20°C for further analysis.


Library construction and high-throughput sequencing 


Equal amounts of DNA (2 μg each) were mixed for library construction and next-generation
sequencing by Onemore Technology (Wuhan) Co., Ltd. The genomic DNA was randomly
fragmented into 200 to 350 bp fragments using a Covaris ultrasonic processor. Two
paired-end DNA libraries were constructed through end repair, addition of poly-A tails and
sequencing adapters, purification and PCR amplification, and were then sequenced using the
Illumina HiSeq™ 4500 sequencing technology.


Sequence cleaning and genome assembly 


Raw data output from the Illumina platform were first transformed into sequence reads by
base calling and recorded in FASTQ format. Subsequently, clean reads were obtained
after filtering adaptor sequences and low-quality reads with Cutadapt v.1.16 (Martin 2011).
SOAPdenovo v.2.04, which employs a de Bruijn graph-based assembly strategy (Kajitani et
al. 2014), was used to assemble the clean data with the parameters “-K
53 -R -M 3 -d 1”. First, reads sequenced from the small-fragment library were divided into shorter
substrings (K-mers) to construct a preliminary de Bruijn graph. Then, a simplified de
Bruijn graph was obtained by removing low-coverage branches and branches that
could not be extended further because of sequencing errors, and the sequences at every
bifurcation were truncated to obtain the initial contigs. Finally, by mapping the paired-end
reads back to the contigs, the connectivity between the reads and the
insert size information were used to assemble the contigs into
scaffolds and obtain the primary genomic sequence.
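

The K-mer and de Bruijn graph steps described above can be illustrated with a minimal Python sketch. It is not the SOAPdenovo implementation: the function names, the toy reads and the small K value are assumptions chosen for illustration, and error correction, coverage filtering and scaffolding are all omitted.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in some read."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def extend_until_branch(graph, start):
    """Extend a contig from `start` while the current node has exactly one successor.

    Simplification: only the out-degree is checked, so this is a toy walk,
    not a full unitig construction.
    """
    contig, node, seen = start, start, {start}
    while True:
        successors = set(graph.get(node, []))
        if len(successors) != 1:
            break  # dead end or bifurcation: truncate the contig here
        node = successors.pop()
        if node in seen:
            break  # avoid looping forever on a cycle
        seen.add(node)
        contig += node[-1]
    return contig


# Hypothetical overlapping reads drawn from the sequence ATGGCGTGCAA.
reads = ["ATGGCGT", "GGCGTGC", "CGTGCAA"]
graph = build_de_bruijn(reads, k=4)
print(extend_until_branch(graph, "ATG"))  # prints ATGGCGTGCAA
```

The walk stops at the first node with zero or several successors, which corresponds to truncating sequences at a bifurcation to obtain initial contigs.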


Screening and identification of SSRs 


The MIcroSAtellite identification tool (MISA) (http://pgrc.ipk-gatersleben.de/misa/),
written in Perl, was used to scan the assembled scaffolds to identify
genome-wide microsatellite repeat units and to analyse the length, location and quantity of
the SSRs (Thiel et al. 2003). The occurrence frequency of SSR loci, the average distribution
distance and density of microsatellites, and the type and length of repeat motifs were calculated
using Microsoft Excel 2019. The default parameters of MISA were used: the
repeat motif length was from 1 to 6 nucleotides and the minimum repeat
counts were 1-10, 2-6, 3-5, 4-5, 5-5 and 6-5, which meant that mononucleotide motifs
had to be repeated at least 10 times, dinucleotide motifs at least 6 times and the
remaining motifs at least 5 times. In addition, the number of bases
interrupting two SSRs in a compound microsatellite should be less than 100. Considering
Watson-Crick complementarity and the differences in base arrangement,
repeat sequences and their complementary sequences were grouped together. For
example, (AC)n, (CA)n, (TG)n and (GT)n were treated as the same SSR repeat type.
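

The screening rules above can be sketched in Python as follows. This is a simplified regular-expression scan under the stated minimum repeat counts, not the MISA code itself; the function names and the example sequence are assumptions for illustration, and compound microsatellites are not handled.

```python
import re

# Minimum repeat counts per motif length, taken from the thresholds quoted in the text.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Smallest string among all rotations of the motif and of its reverse complement."""
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    rotations = [s[i:] + s[:i] for s in (motif, reverse_complement) for i in range(len(s))]
    return min(rotations)


def is_primitive(motif):
    """False when the motif is itself a repeat of a shorter unit, e.g. 'ACAC'."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_perfect_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for size, min_repeats in MIN_REPEATS.items():
        # One motif copy followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Hypothetical sequence containing (CA)7 and (A)11.
example = "GG" + "CA" * 7 + "TTG" + "A" * 11 + "C"
for start, motif, count in find_perfect_ssrs(example):
    print(start, motif, count, canonical_motif(motif))
```

With this grouping, AC, CA, TG and GT all map to the same canonical motif (AC), which mirrors the example given in the text.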


Results 
Genome sequencing and assembly 


The contig and scaffold statistics of the Dysomma anguillare genome are listed in
Table 1. A total of 11,805,379 contigs with a total length of 1,960 Mb were obtained after
assembly and the average GC content was about 42.2%. The number of scaffolds produced
by the SOAPdenovo v.2.04 assembly was 4,060,742, with a total length of 1,561 Mb and
an average GC content of 39.6%.


Table 1. 


The contig and scaffold assembly results statistics. 


Assembly level  Total length (bp)  Number of sequences  Number of sequences ≥ 2 Kb  Maximum length (bp)  N50 (bp)  N90 (bp)  GC content (%)
Contig          1,960,673,378      11,805,379           30,667                      9,646                272       60        42.2
Scaffold        1,561,530,495      4,060,742            95,727                      23,878               709       134       39.6


The N50 value is a widely used metric of the contiguity of the sequences output by assembly
algorithms. It refers to the contig or scaffold length at which the accumulated
length (summed from the longest sequence to the shortest) first exceeds 50% of the total length
of all contigs or scaffolds. The greater the N50 value, the fewer the sequences needed to cover
half of the assembly and the better the assembly quality. In this study, the N50 values of the
contig and scaffold assemblies
were 272 bp and 709 bp, respectively. Compared with the assembled genomes of the related
species Anguilla japonica (Henkel et al. 2012), A. anguilla (Jansen et al. 2017) and A.
rostrata (Pavey et al. 2017), the assembly of Dysomma anguillare was relatively
good and the microsatellite markers developed from it could reflect the genome-wide
characteristics of SSRs.
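

The N50 definition above can be expressed as a short Python sketch. The function name and the toy length list are assumptions for illustration, and the common convention of "reaches or exceeds" the threshold is used.

```python
def n_statistic(lengths, fraction=0.5):
    """Length at which the running total (longest first) first reaches
    `fraction` of the summed length; fraction=0.5 gives N50, 0.9 gives N90."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # only reachable through floating-point rounding


# Hypothetical scaffold lengths in bp (total 290 bp).
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n_statistic(toy_lengths))       # N50 = 70
print(n_statistic(toy_lengths, 0.9))  # N90 = 30
```

In the toy example, the two longest sequences (80 + 70 = 150 bp) already cover more than half of the 290 bp total, so the N50 is 70 bp.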


SSR repeat types and distribution 


A total of 1,160,104 microsatellites with 1-6 bp nucleotide motifs were detected in 770,294
scaffolds, 234,959 of which contained more than one SSR locus, giving an occurrence
frequency (total number of SSRs detected/total number of scaffolds) of 28.57%. The
distribution density (total length of scaffolds/total number of SSRs screened) was on
average one SSR per 1.35 kb and the relative abundance (total number of SSRs screened/total
length of scaffolds) was 743/Mb.
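

As a check on the arithmetic, the three summary statistics can be recomputed from the counts reported in the text and in Table 1. The variable names below are illustrative only.

```python
# Counts reported in the text and in Table 1 (scaffold level).
total_ssrs = 1_160_104
total_sequences = 4_060_742
total_length_bp = 1_561_530_495

# Occurrence frequency: SSRs detected per assembled sequence, as a percentage.
occurrence_frequency = total_ssrs / total_sequences * 100

# Distribution density: average sequence length per SSR, in kb.
mean_spacing_kb = total_length_bp / total_ssrs / 1000

# Relative abundance: SSRs per megabase of assembled sequence.
relative_abundance = total_ssrs / (total_length_bp / 1_000_000)

print(f"{occurrence_frequency:.2f}%")        # 28.57%
print(f"1 SSR per {mean_spacing_kb:.2f} kb")  # 1 SSR per 1.35 kb
print(f"{relative_abundance:.0f} SSRs/Mb")    # 743 SSRs/Mb
```

The recomputed values agree with the reported 28.57%, 1.35 kb and 743/Mb.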


These SSR loci can be classified into six repeat types: mononucleotide, dinucleotide,
trinucleotide, tetranucleotide, pentanucleotide and hexanucleotide. The most abundant
repeat type was dinucleotide, accounting for 51.05% of all SSR loci,
followed by mononucleotide (37.69%), trinucleotide (8.08%), tetranucleotide (2.71%) and
pentanucleotide (0.25%), while hexanucleotide was the least abundant (0.21%) (Fig. 1).
The occurrence frequency was highest for dinucleotide repeats (14.58%) and
lowest for hexanucleotide repeats (0.06%).
The relative abundance of dinucleotide repeats reached 379.27/Mb, with an average of one SSR
locus per 2.64 kb, followed by mononucleotide repeats (280.00/Mb). By comparison, the
relative abundance of hexanucleotide repeats was the lowest (1.59/Mb) (Table 2).


Repeat numbers of different SSRs 


The number of repeats of SSR loci mainly ranged from 5 to 24. The most common repeat
number was 10, comprising 17.52% of the total number of SSR loci.
In general, the number of SSR loci of each repeat type decreased as the repeat number
increased (Fig. 2). The repeats of mononucleotide, dinucleotide and trinucleotide motifs were
mainly distributed in the ranges of 10-19 (96.83%), 6-15 (95.15%) and 5-9 (85.34%),
respectively. By contrast, the repeat numbers of the remaining types (tetranucleotide,
pentanucleotide and hexanucleotide) were all no more than 13 and were mainly
in the range of 5-8, accounting for 92.40%, 96.70% and 99.56%, respectively (Table
3).


In summary, the repeat numbers of SSR loci were mainly concentrated in the ranges of 10-15 and
5-8, with a total number of 1,016,359 (87.61%). Few SSR loci with more than 25
repeats were identified and these consisted solely of
mononucleotide repeats.


Copy numbers of repeat units


Amongst the 1,488 repeat motifs detected, hexanucleotide repeats had the most
motif types and pentanucleotide repeats took second place. Mononucleotide repeats had the
fewest types, being limited by the number of bases (Table 4). Amongst all
these repeats, the dominant repeat motifs in mononucleotide, dinucleotide, trinucleotide,
tetranucleotide, pentanucleotide and hexanucleotide were A (191,390, 43.77%), CA
(150,240, 25.37%), AAT (13,168, 14.05%), CACG (2,649, 8.41%), TAATG (119, 19.16%)
and CCCTAA (190, 7.65%), respectively (Fig. 3, Table 4).


Table 2. 


Proportions of each SSR repeat types in the genome of D. anguillare. 


Repeat type      Number     Occurrence frequency (%)  Relative abundance (per Mb)  Average length (bp)  Total length (bp)
Mononucleotide   437,234    10.77                     280.00                       0.87                 379,455
Dinucleotide     592,234    14.58                     379.27                       0.50                 298,350
Trinucleotide    93,734     2.31                      60.03                        0.72                 67,533
Tetranucleotide  31,481     0.78                      20.16                        0.72                 22,680
Pentanucleotide  2,936      0.07                      1.88                         0.82                 2,409
Hexanucleotide   2,485      0.06                      1.59                         0.71                 1,774
Total            1,160,104  28.57                     742.93                       4.35                 772,201


Figure 1.


Distribution of SSR repeat types in the genome of D. anguillare.


SSR length distribution and polymorphism evaluation


The sequence length of the different types of SSRs varied considerably, from 10 to 54 bp (Fig.
4). The minimum and maximum variations in length were detected in hexanucleotide and
mononucleotide repeats, respectively. The former ranged from 30 to 54 bp with a
total length of 1,774 bp, while the latter ranged from 10 to 51 bp with a total length of
379,455 bp, which constituted approximately 49.14% of the total length of SSRs. Amongst
the six types of nucleotide repeat, mononucleotide and dinucleotide repeats were dominant
from the perspective of sequence length, totalling 677,805
bp and accounting for 87.78% of the total length of all SSRs.


Table 3. 


Distribution interval of the copy number in different microsatellite motif for D. anguillare. 


Repeat number  Mononucleotide   Dinucleotide     Trinucleotide    Tetranucleotide  Pentanucleotide  Hexanucleotide   Total      Proportion (%)
5              0                0                32,413           17,143           2,071            972              52,599     4.53
6              0                162,916          18,834           7,300            498              394              189,942    16.37
7              0                101,359          13,353           3,184            176              270              118,342    10.20
8              0                72,287           9,325            1,460            94               838              84,004     7.24
9              0                57,111           6,070            895              46               11               64,133     5.53
10             152,127          46,524           3,962            594              51               0                203,258    17.52
11             90,414           38,631           2,619            422              0                0                132,086    11.39
12             58,458           30,798           1,896            430              0                0                91,582     7.89
13             40,161           23,987           1,488            53               0                0                65,689     5.66
14             27,543           17,686           1,203            0                0                0                46,432     4.00
15             18,717           12,237           1,451            0                0                0                32,405     2.79
16             13,469           8,578            1,069            0                0                0                23,116     1.99
17             9,925            5,919            51               0                0                0                15,895     1.37
18             7,271            3,962            0                0                0                0                11,233     0.97
19             5,276            2,745            0                0                0                0                8,021      0.69
20             3,800            2,077            0                0                0                0                5,877      0.51
21             2,605            1,480            0                0                0                0                4,085      0.35
22             1,889            1,344            0                0                0                0                3,233      0.28
23             1,297            1,670            0                0                0                0                2,967      0.26
24             853              878              0                0                0                0                1,731      0.15
25             697              45               0                0                0                0                742        0.06
>25            2,712            0                0                0                0                0                2,712      0.23


The length of a microsatellite is one of the main factors affecting its polymorphism.
Temnykh et al. (2001) divided SSR sequences into two categories: the highly polymorphic
type I (length ≥ 20 bp) and the moderately polymorphic type II (12 bp < length < 20 bp).
Microsatellites shorter than 12 bp show lower polymorphism and lower mutation
potential. In the present study, there were 21,347 type I SSRs (19%) and 294,373 type II
SSRs (54%). SSR loci with low mutation potential accounted for 27%.
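

The length-based classification can be sketched in Python as below. The function name and the example loci are hypothetical, and treating an SSR of exactly 12 bp as type II is an assumption, since the text leaves that boundary case open.

```python
from collections import Counter


def ssr_class(length_bp):
    """Classify an SSR by its total length, following the categories in the text."""
    if length_bp >= 20:
        return "type I"   # highly polymorphic
    if length_bp >= 12:   # assumption: exactly 12 bp is counted as type II
        return "type II"  # moderately polymorphic
    return "short"        # shorter than 12 bp


# Hypothetical loci given as (motif length, repeat count).
loci = [(1, 10), (2, 6), (2, 10), (3, 5), (6, 5)]
counts = Counter(ssr_class(motif_len * repeats) for motif_len, repeats in loci)
print(counts)
```

Here the SSR length is taken as motif length multiplied by repeat count, so a (CA)10 locus is 20 bp long and falls into type I.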


Discussion 


Number and relative abundance of microsatellites in the genome of 
Dysomma anguillare 


Bioinformatics software was used to search for and analyse the types and
numbers of the six classes of perfect microsatellites in the genome of Dysomma anguillare.
A total of 1,160,104 microsatellite loci were revealed across the 1.56 Gb genome
sequence, with a total length of 24,707,980 bp (occupying 1.58% of the full genome length).
Compared with other published genomes of bony fishes, this proportion was higher than that of
Takifugu rubripes (0.77%) (Cui et al. 2006), Scleropages formosus (0.78%) (Duan et al. 2019)
and Bagarius yarrelli (1.23%) (Yang et al. 2021), but lower than that of Pelteobagrus
fulvidraco (1.80%) (Xu et al. 2020) and Harpadon nehereus (2.01%) (Yang et al. 2021),
indicating that genome-wide microsatellite content was not directly related to the genetic
relationship; the differences might also involve different retrieval tools, parameter settings
and databases (He et al. 2015).
Hancock (1996) speculated that the number of microsatellites increased with
chromosome length, and the disproportional relationship between genome size and
microsatellite number was also confirmed in our study.


Figure 2.


SSR repeats distribution of D. anguillare. 


Table 4.


Dominant base types and the proportion in genome of D. anguillare. 


Repeat type      Number of types  Maximum motif  Maximum number  Maximum proportion (%)  Minimum motif                            Minimum number  Minimum proportion (%)
Mononucleotide   4                A              191,390         43.77                   G                                        43,065          9.85
Dinucleotide     12               CA             150,240         25.37                   GC                                       604             0.1
Trinucleotide    59               AAT            13,168          14.05                   ACG                                      26              0.03
Tetranucleotide  232              CACG           2,649           8.41                    ACCC/ACTT/AGGT/CCAC/CCGA/CGAT/TACG/TGGG  1               0.00
Pentanucleotide  574              TAATG          110             19.16                   -                                        1               0.17
Hexanucleotide   608              CCCTAA         190             7.65                    -                                        1               0.04


Figure 3.


The distribution of microsatellite repeats in the genome of D. anguillare.


Relative abundance is an important measure of microsatellite richness. It was
calculated to be 743/Mb in Dysomma anguillare, which was higher than that of other
marine fishes, such as Scatophagus argus (653/Mb) (Wang et al. 2020), Cociella
crocodilus (428/Mb) (Zhao et al. 2021), Tridentiger bifasciatus (347/Mb) (Zhao et al. 2022)
and four species of pufferfishes (365/Mb in Takifugu rubripes, 369/Mb in Takifugu flavidus,
397/Mb in Takifugu bimaculatus and 525/Mb in Tetraodon nigroviridis) (Xu et al. 2021). This
result showed that abundant microsatellites exist in the genome of D. anguillare,
which would provide sufficient molecular markers for further germplasm identification
and genetic diversity studies.


Figure 4.


Length distribution of genes in D. anguillare.
A SSR length distribution; B Distribution types of SSR (type I and type II).


Distribution characteristics of microsatellites in the genome of Dysomma 
anguillare 


Microsatellite types composed of 1-6 nucleotide repeats were found in the
genome of Dysomma anguillare. Dinucleotide repeats were the most frequent, followed
by mononucleotide repeats, while SSRs with 3-6 nucleotide motifs each accounted for
no more than 10%. Therefore, priority should be given to dinucleotide repeats
when designing SSR primers for D. anguillare. Mononucleotide and dinucleotide repeats
are regarded as the most abundant types of SSRs in most species. It has been reported that
mononucleotide repeats tend to dominate in the genomes of higher organisms
(Gao and Kong 2005). However, dinucleotide repeats account for higher proportions in fish
genomes, which is probably related to differences in gene expression and regulation.


The CA repeat motif was the most abundant amongst dinucleotide repeats, accounting for
25.37% of them, which was consistent with Scophthalmus maximus (Ruan 2009) and
pufferfishes (Cui et al. 2006, Xu et al. 2021), but different from Ictalurus punctatus (Tang et
al. 2022), while the GC repeat motif was the least abundant. Base slippage might
generate microsatellites more easily at a low melting temperature (Tm). The two hydrogen
bonds between A-T base pairs are more likely to be broken than the three hydrogen bonds
between G-C base pairs, resulting in fewer GC repeats (Huang et al. 2020).
Some other scholars have pointed out that the methylation of CpG might cause the spontaneous
deamination of cytosine to thymine in order to maintain the thermodynamic stability of the
DNA molecule. In this study, the proportion of the GC repeat motif was only 0.1% and, from
this perspective, the low GC content of the whole genome was also consistent with the small
number of GC repeats (Schorderet and Gartler 1992).


The structural instability and composition of trinucleotide repeats are closely related to
some genetic diseases in humans (Sinden et al. 2002). The AAT repeat motif
was the most numerous of the trinucleotide repeats in the Dysomma anguillare genome,
the same as in humans and other primates (Kelkar et al. 2008). Therefore, in-depth analysis of
trinucleotide repeats might help to predict gene loci associated with human
diseases and thereby reduce the occurrence of certain illnesses by changing gene
expression.


Copy numbers and length variations in the genome of Dysomma 
anguillare 


The repeat unit length is inversely related to the copy number of microsatellite DNA
(Harr and Schlotterer 2000). In general, a higher SSR copy number means
more alleles and richer polymorphism. The number of microsatellite repeats in the
Dysomma anguillare genome was mainly in the range of 5 to 25. Motifs with more
than 25 repeats were very rare (only 2,712 SSRs) and all of them were
mononucleotide repeats. Previous studies showed that the mutation rate of microsatellites
was positively correlated with the copy number of the repeat motif (Wierdl et al. 1997) and
longer microsatellites were expected to have a higher mutation rate owing to more chances
of replication slippage (Calabrese and Sainudiin 2005). Our results demonstrated that the
number of SSRs decreased as the repeat number increased. In addition, tetranucleotide,
pentanucleotide and hexanucleotide microsatellites might have higher mutation rates than
mononucleotide, dinucleotide and trinucleotide microsatellites.


The length of microsatellites in the Dysomma anguillare genome was generally 10-18 bp
and the number of microsatellites was inversely related to the repeat motif length. An
analysis of microsatellite structure and characteristics in the parthenogenetic gastropod
Melanoides tuberculata concluded that the longer the repeat sequence, the greater the
selection pressure it undergoes and the lower the number of repeats (Samadi et al. 1998).
This phenomenon has been verified in various plants and animals, for instance,
Juglans regia (Liao et al. 2014), Patinopecten yessoensis (Ni et al. 2018) and
Phrynocephalus axillaris (Song et al. 2019). According to Temnykh et al. (2001), SSR
polymorphism can be classed as low, medium or high and SSRs
longer than 12 bp are potential molecular markers with high polymorphism. In this study,
the type I and type II SSRs in the D. anguillare genome accounted for about 73% of the total,
showing great potential for the development of polymorphic microsatellite markers.


Conclusions


In conclusion, MISA software was used for the first time to search and analyse six types of 
perfect microsatellite loci from the whole genome survey data of Dysomma anguillare. The 
results showed that both the relative abundance and density of various microsatellite types 
were very high. Amongst the 1,160,104 SSR loci, the numbers of the different repeat types
followed the order: dinucleotide > mononucleotide > trinucleotide > tetranucleotide >
pentanucleotide > hexanucleotide. The dominant repeat motifs of these types were A, CA, AAT,
CACG, TAATG and CCCTAA, respectively. The results supplemented the genetic marker
database of marine fishes and provided valuable information resources for further genetic 
analysis of D. anguillare. 